Efficient Skyline Computation over Ad-hoc Aggregations
Authors
Abstract
Aggregation is among the core functionalities of OLAP systems. Aggregation queries are frequently issued in decision support systems to identify interesting groups of data. When more than one aggregation function is involved and the notion of interest is not clearly defined, skyline queries provide a robust mechanism for capturing the potentially interesting points: (i) users do not need to specify a ranking function, and (ii) the result is independent of the dimension scales. To provide better exploration functionality in OLAP systems, we propose using skyline queries over aggregated data to identify the most interesting groups. Since the aggregation function has to be ad-hoc to cover a wide variety of user interests, the skyline over the aggregates has to be computed on the fly. Hence, any algorithm that computes such a skyline must be fast and must produce the result set progressively, emitting potential skyline groups as early as possible. We explore a family of algorithms that try to consume only as many data records as are necessary to compute the skyline, and we design an optimal algorithm. We further refine the algorithm by taking into account systems issues such as disk behavior, which are often ignored but have a strong impact on real system performance. Experimental results validate the performance and progressive benefits of our algorithm.
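To make the idea of a skyline over aggregates concrete, here is a minimal Python sketch. It is not the progressive algorithm the abstract refers to: it scans every record, aggregates fully, and only then compares groups pairwise. The function names (`aggregate`, `dominates`, `skyline`), the sample records, the choice of `sum` and `max` as aggregation functions, and the convention that larger values are better are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate(records, aggs):
    """Aggregate per group.

    records: iterable of (group, measures), where measures is a tuple
             with one value per aggregation function in `aggs`.
    aggs:    one aggregation function per measure position (hypothetical
             ad-hoc choice supplied by the user at query time).
    """
    columns = defaultdict(lambda: [[] for _ in aggs])
    for group, measures in records:
        for i, value in enumerate(measures):
            columns[group][i].append(value)
    return {
        group: tuple(f(col) for f, col in zip(aggs, cols))
        for group, cols in columns.items()
    }


def dominates(a, b):
    """a dominates b if a is at least as good everywhere and strictly
    better somewhere (assuming larger is better in every dimension)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points):
    """Return the groups whose aggregate vector is dominated by no other group."""
    return {
        g for g, p in points.items()
        if not any(dominates(q, p) for h, q in points.items() if h != g)
    }


# Illustrative data: (group, (measure_1, measure_2)).
records = [
    ("A", (10, 2)), ("A", (20, 4)),
    ("B", (5, 9)), ("B", (7, 7)),
    ("C", (4, 1)), ("C", (6, 3)),
]
aggregates = aggregate(records, [sum, max])
result = skyline(aggregates)
```

With these sample records, group C is dominated by group A, while A and B are incomparable, so the skyline contains A and B. Because this baseline must read all records before reporting anything, it illustrates the cost that an approach consuming only the necessary records aims to avoid.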
Similar resources
Mining Thick Skylines over Large Databases
Recently, there has been interest in a new operator, called skyline [3], which returns the objects that are not dominated by any other objects with regard to certain measures in a multi-dimensional space. Recent work on the skyline operator [3, 15, 8, 13, 2] focuses on efficient computation of skylines in large databases. However, such work gives users only thin skylines, i.e., single objects, whi...
Skyline Evaluation Within Join Operation, Block Nested Loop Join Implementation
The naïve skyline join approach computes the join first and then applies skyline computation to find the corresponding skyline objects. As the cardinality and dimensionality of the join table increase, computing the skyline over a non-reductive join relation becomes costlier than computing it over a single table. Most of the existing work on skyline queries for databases mainly discusses the...
Broadcast Routing in Wireless Ad-Hoc Networks: A Particle Swarm Optimization Approach
While routing in multi-hop packet radio networks (static Ad-hoc wireless networks), it is crucial to minimize power consumption since nodes are powered by batteries of limited capacity and it is expensive to recharge the device. This paper studies the problem of broadcast routing in radio networks. Given a network with an identified source node, any broadcast routing is considered as a directed...
Semi-Skylines and Skyline-Snippets
Skyline evaluation techniques (also known as Pareto preference queries) follow a common paradigm that eliminates data elements by finding other elements in the data set that dominate them. To date already a variety of sophisticated skyline evaluation techniques are known, hence skylines are considered a well researched area. Nevertheless, in this paper we come up with interesting new aspects. O...
UNIVERSITÄT AUGSBURG Semi-Skylines and Skyline Snippets
Skyline evaluation techniques (also known as Pareto preference queries) follow a common paradigm that eliminates data elements by finding other elements in the data set that dominate them. To date already a variety of sophisticated skyline evaluation techniques are known, hence skylines are considered a well researched area. Nevertheless, in this paper we come up with interesting new aspects. O...
Journal title:
Volume, issue:
Pages: -
Publication date: 2008